Ending | Count |
---|---|
байна. | 28546 |
юм. | 18995 |
дээ. | 9325 |
байсан. | 7196 |
байгаа. | 6248 |
билээ. | 6038 |
байлаа. | 5422 |
болно. | 4613 |
байдаг. | 4447 |
бий. | 3963 |
даа. | 3471 |
аж. | 3260 |
байв. | 3050 |
байх. | 2922 |
болсон. | 2805 |
хэрэгтэй. | 2528 |
уу? | 2504 |
вэ? | 2490 |
ёстой. | 2478 |
гэнэ. | 2414 |
шүү. | 2102 |
уу. | 1923 |
биш. | 1795 |
гэдэг. | 1585 |
болжээ. | 1523 |
гэсэн. | 1432 |
байхгүй. | 1428 |
ээ. | 1333 |
боллоо. | 1299 |
үг. | 1265 |
The next four subsections show the most frequent sentence endings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the end of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence endings.
select substring_index(sentence, ' ', -1) as ending, count(*) as cnt from sentences group by substring_index(sentence, ' ', -1) order by cnt desc limit 50;
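The same count can be reproduced outside the database, which may help when checking the query on a small sample. The following is a minimal Python sketch, not the pipeline used for the table: it assumes sentences are plain strings whose words are separated by single spaces, and the names `last_n_words`, `most_frequent_endings`, and the three sample sentences are made up for illustration. Taking the last N space-separated tokens is meant to mirror `substring_index(sentence, ' ', -N)`.

```python
from collections import Counter


def last_n_words(sentence: str, n: int = 1) -> str:
    """Return the last n space-separated tokens of a sentence.

    Intended to behave like MySQL's SUBSTRING_INDEX(sentence, ' ', -n):
    if the sentence has fewer than n tokens, the whole sentence is returned.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = sentence.split(" ")
    return " ".join(tokens[-n:])


def most_frequent_endings(sentences, n: int = 1, limit: int = 50):
    """Count sentence endings of n words and return the `limit` most frequent."""
    counts = Counter(last_n_words(s, n) for s in sentences)
    return counts.most_common(limit)


# Hypothetical toy sample, NOT taken from the corpus.
sample = [
    "Тэр сайн байна.",
    "Энэ ном байна.",
    "Би явсан юм.",
]

for ending, cnt in most_frequent_endings(sample, n=1):
    print(ending, cnt)
```

Calling `most_frequent_endings(sample, n=2)` would group by the last two words instead, which corresponds to the N=2 case covered in the following subsection.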
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV